- Path: mail2news.demon.co.uk!genesis.demon.co.uk
- From: Lawrence Kirby <fred@genesis.demon.co.uk>
- Newsgroups: comp.lang.c
- Subject: Re: Using regexec for multiple patterns. How???
- Date: Thu, 29 Feb 96 13:28:40 GMT
- Organization: none
- Distribution: all
- Message-ID: <825600520snz@genesis.demon.co.uk>
- References: <4h2pdj$fui@mo6.rc.tudelft.nl>
- Reply-To: fred@genesis.demon.co.uk
- X-NNTP-Posting-Host: genesis.demon.co.uk
- X-Newsreader: Demon Internet Simple News v1.27
- X-Mail2News-Path: genesis.demon.co.uk
-
- In article <4h2pdj$fui@mo6.rc.tudelft.nl>
- koen@lr46pstn.lr.tudelft.nl "Koen D'Hondt" writes:
-
- >Hi All,
- >
- >I've been experimenting with regcomp and friends to do pattern-matching in
- >texts, but I've run into a few problems trying to search for multiple
- >occurrences of a search pattern. The regcomp and regexec are the GNU stuff.
- >The system I'm using is Linux 1.3.63.
-
- comp.lang.c isn't the place to discuss regcomp since it isn't part of the
- C language. Try a GNU or Linux newsgroup or possibly comp.unix.programmer.
-
- ...
-
- >/* this causes a SIGSEGV when count > 1 */
- > for(i = 0; i < count && pmatch[i].rm_so != -1; i++)
- > {
- > bzero(buf,128);
-
- bzero isn't portable even among Unix systems. The ANSI equivalent is
- memset, i.e. memset(buf, 0, 128) here.
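A minimal sketch of the substitution, assuming a plain char buffer. The helper name clear_buffer is made up for illustration; only memset itself is standard. Note that memset takes the fill value before the byte count, whereas bzero takes just a pointer and a count.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: zero a buffer portably, as bzero(buf, size) would. */
static void clear_buffer(char *buf, size_t size)
{
    assert(buf != NULL);
    memset(buf, 0, size);   /* arguments: destination, fill value, byte count */
}
```

In the quoted loop this would replace bzero(buf,128) with clear_buffer(buf, sizeof buf), provided buf is an array in scope and not a pointer.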
-
- > chPtr = &search_text[pmatch[i].rm_so];
- > strncpy(buf, chPtr, pmatch[i].rm_eo - pmatch[i].rm_so);
- > strcat(buf,"\0");
-
- This is a true horror. As you seem to be aware, strncpy doesn't necessarily
- null-terminate the result string. However, strcat *requires* a properly
- terminated string (that's the only way it can find where the end of the string
- is). Since "\0" is just an empty string as far as strcat is concerned,
- strcat(buf,"\0"); will either do nothing, if it is given a properly
- terminated string, or produce undefined behaviour (e.g. possibly crash) if
- not. Write the last 2 lines as:
-
- buf[0] = '\0';
- strncat(buf, chPtr, pmatch[i].rm_eo - pmatch[i].rm_so);
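The two lines above can be wrapped up as a small routine. This is only a sketch under stated assumptions: the name extract_span and the use of long for the offsets are illustrative choices (the real rm_so and rm_eo members have type regoff_t), and the length clamp is an addition that the original lines do not have. strncat always writes a terminating null after the characters it appends, which is why the result is a valid string even when the match is truncated.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy text[so..eo) into buf as a terminated string.
 * Returns 0 on success, -1 if the offsets do not describe a match.
 * Matches longer than the buffer are truncated to bufsize - 1 characters. */
static int extract_span(char *buf, size_t bufsize,
                        const char *text, long so, long eo)
{
    size_t len;

    assert(buf != NULL && text != NULL && bufsize > 0);

    buf[0] = '\0';                 /* start from an empty, terminated string */
    if (so < 0 || eo < so)
        return -1;                 /* e.g. rm_so == -1 means "no match" */

    len = (size_t)(eo - so);
    if (len >= bufsize)
        len = bufsize - 1;         /* leave room for the terminating null */

    strncat(buf, text + so, len);  /* appends len chars, then a null */
    return 0;
}
```

With a 128-byte buf as in the quoted code, the call would look like extract_span(buf, sizeof buf, search_text, pmatch[i].rm_so, pmatch[i].rm_eo).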
-
- > printf("%i: start at %i, end at %i (%s)\n",
- > i, pmatch[i].rm_so, pmatch[i].rm_eo, buf);
-
- While %i is supported by ANSI C, %d is more commonly used since it is also
- portable to older (pre-ANSI) C compilers.
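For output the two conversions behave identically, so switching is a mechanical change. A sketch of the same message using %d follows. The helper name format_match is hypothetical, the offsets are passed as int on the assumption that the caller has converted them, and the output buffer is assumed to be large enough for the formatted line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format one match report into out using %d.
 * The caller must supply a buffer large enough for the whole line.
 * Returns the number of characters written, as sprintf does. */
static int format_match(char *out, int i, int so, int eo, const char *text)
{
    assert(out != NULL && text != NULL);
    return sprintf(out, "%d: start at %d, end at %d (%s)\n", i, so, eo, text);
}
```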
-
- --
- -----------------------------------------
- Lawrence Kirby | fred@genesis.demon.co.uk
- Wilts, England | 70734.126@compuserve.com
- -----------------------------------------
-